iOS MachineLearning Series (6) - Analyzing Object Trajectories in Video
Trajectory analysis is a higher-level application built on top of object tracking. The Vision framework can detect the motion trajectories of multiple objects in a video, which is very useful in fitness and sports applications.
Trajectory detection needs a sequence of motion states to work with, so this kind of request is stateful. A stateful request can be performed by a handler multiple times; it automatically accumulates the previous state and uses it to analyze the trajectory path. Note that the camera should stay relatively still during trajectory detection, because camera movement can reduce detection accuracy.
In everyday scenarios, trajectory detection can be used, for example, to correct a throwing motion or to predict where a ball will land.
1 - Parsing the Flight Trajectory of an Object in a Video
Trajectory detection has to keep state, so the image data passed in for analysis must be CMSampleBuffer data that carries CMTime information. For a video file, the first step is to extract its image frames, that is, to obtain the CMSampleBuffer data. Example code:
```swift
import AVFoundation
import Vision

func detectTrajectories() {
    let videoURL = URL(fileURLWithPath: Bundle.main.path(forResource: "video2", ofType: ".mov")!)
    let asset = AVAsset(url: videoURL)
    guard let videoTrack = asset.tracks(withMediaType: .video).first else {
        return
    }
    // Derive the duration of a single frame from the track's frame rate
    let frameRate = videoTrack.nominalFrameRate
    let frameDuration = CMTime(seconds: 1 / Double(frameRate), preferredTimescale: CMTimeScale(NSEC_PER_SEC))
    // Read decoded frames as 32BGRA pixel buffers
    let assetReaderOutputSettings: [String: Any] = [
        kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
    ]
    let assetReaderOutput = AVAssetReaderTrackOutput(track: videoTrack, outputSettings: assetReaderOutputSettings)
    let assetReader = try! AVAssetReader(asset: asset)
    assetReader.add(assetReaderOutput)
    if assetReader.startReading() {
        // Pull sample buffers one by one until the track is exhausted
        while let sampleBuffer = assetReaderOutput.copyNextSampleBuffer() {
            autoreleasepool {
                if CMSampleBufferDataIsReady(sampleBuffer) {
                    let timestamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
                    processFrame(sampleBuffer, atTime: timestamp, withDuration: frameDuration)
                }
            }
        }
    }
}
```
The processFrame method performs the trajectory analysis. It is implemented as follows:
```swift
func processFrame(_ sampleBuffer: CMSampleBuffer, atTime time: CMTime, withDuration duration: CMTime) {
    // Each frame gets its own handler; the stateful request accumulates state across these calls
    let handler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer, orientation: .up)
    try? handler.perform([request])
}
```
The request object is constructed as follows:
```swift
lazy var request: VNDetectTrajectoriesRequest = {
    // The completion handler is called each time the request produces results
    let req = VNDetectTrajectoriesRequest(frameAnalysisSpacing: .zero, trajectoryLength: 10) { result, error in
        if let error {
            print(error)
        }
        self.handleResult(request: result as! VNDetectTrajectoriesRequest)
    }
    return req
}()
```
The parameters used here are explained in detail later in this article.
In the example, we can add an AVPlayer to play the original video and draw the detected trajectory at the corresponding positions over the video for comparison. An example handleResult method:
```swift
func handleResult(request: VNDetectTrajectoriesRequest) {
    for res in request.results ?? [] {
        // projectedPoints are normalized coordinates along the detected trajectory
        let points = res.projectedPoints
        for p in points {
            DispatchQueue.main.async {
                let v = UIView()
                // Size the drawing area to match the video frame's aspect ratio
                let scale = self.image.size.width / self.image.size.height
                let width = self.view.frame.width
                let height = width / scale
                let size = CGSize(width: width, height: height)
                v.backgroundColor = .red
                let offsetY = self.view.frame.height / 2 - height / 2
                // Vision's origin is at the bottom-left, so flip the y axis before drawing
                v.frame = CGRect(x: p.x * size.width, y: (1 - p.y) * size.height + offsetY, width: 4, height: 4)
                self.view.addSubview(v)
            }
        }
    }
}
```
The resulting trajectory analysis is shown below:
2 - The VNDetectTrajectoriesRequest and VNTrajectoryObservation Classes
VNDetectTrajectoriesRequest is a stateful analysis request class; it inherits from VNStatefulRequest and is defined as follows:
```swift
open class VNDetectTrajectoriesRequest : VNStatefulRequest {

    // frameAnalysisSpacing controls how often frames are analyzed (.zero analyzes every frame);
    // trajectoryLength is the number of points required to make up a trajectory
    public init(frameAnalysisSpacing: CMTime, trajectoryLength: Int, completionHandler: VNRequestCompletionHandler? = nil)

    // The number of points that make up a trajectory, as passed to the initializer
    open var trajectoryLength: Int { get }

    // Minimum radius, normalized to the image size, of an object to be tracked
    open var objectMinimumNormalizedRadius: Float

    // Earlier name for the minimum object size
    open var minimumObjectSize: Float

    // Maximum radius, normalized to the image size, of an object to be tracked
    open var objectMaximumNormalizedRadius: Float

    // Earlier name for the maximum object size
    open var maximumObjectSize: Float

    open var targetFrameTime: CMTime

    // The analysis results
    open var results: [VNTrajectoryObservation]? { get }
}
```
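As a rough sketch of how these properties might be tuned (the concrete radius values below are illustrative assumptions, not values from the example above), the request from section 1 could be restricted to objects within a certain size range:

```swift
import Vision
import CoreMedia

// A sketch of tuning a trajectory request; the values used here are
// illustrative assumptions rather than recommended settings.
let tunedRequest = VNDetectTrajectoriesRequest(frameAnalysisSpacing: .zero,  // analyze every frame
                                               trajectoryLength: 10) { request, error in
    guard error == nil,
          let results = (request as? VNDetectTrajectoriesRequest)?.results else { return }
    for trajectory in results {
        print("confidence: \(trajectory.confidence), points: \(trajectory.projectedPoints.count)")
    }
}

// Ignore objects whose radius falls outside this range,
// expressed as a fraction of the image size (0 ... 1).
tunedRequest.objectMinimumNormalizedRadius = 0.02
tunedRequest.objectMaximumNormalizedRadius = 0.5
```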
The VNTrajectoryObservation class is the result class of trajectory analysis; it encapsulates the points that make up the trajectory. It is defined as follows:
```swift
open class VNTrajectoryObservation : VNObservation {

    // The raw points detected for the moving object, in normalized coordinates
    open var detectedPoints: [VNPoint] { get }

    // The detected points projected onto the fitted trajectory
    open var projectedPoints: [VNPoint] { get }

    // Coefficients a, b, c of the fitted parabola y = ax^2 + bx + c
    open var equationCoefficients: simd_float3 { get }

    // The moving-average radius of the detected object
    open var movingAverageRadius: CGFloat { get }
}
```
The equationCoefficients property holds the coefficients of the fitted parabola, that is, the formula below:
y = ax^2 + bx + c
The simd_float3 structure encapsulates the values of a, b, and c.
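For example, the fitted parabola can be evaluated directly from equationCoefficients. A minimal sketch, assuming the three components are stored in the order a, b, c described above:

```swift
import Vision
import simd

// Evaluate the fitted parabola y = ax^2 + bx + c at a normalized x position.
// Assumes equationCoefficients stores (a, b, c) in its x, y, z components.
func parabolaHeight(of trajectory: VNTrajectoryObservation, atNormalizedX x: Float) -> Float {
    let coefficients = trajectory.equationCoefficients
    let a = coefficients.x
    let b = coefficients.y
    let c = coefficients.z
    return a * x * x + b * x + c
}
```

This kind of evaluation can be used, for instance, to extrapolate where the trajectory would continue beyond the points Vision has observed so far.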